The OGI multi-language telephone speech corpus
Abstract
The OGI Multi-language Telephone Speech Corpus is designed to support research on automatic language identification and multi-language speech recognition. The corpus consists of up to nine separate responses from each caller, ranging from single words to short topic-specific descriptions to 60 seconds of unconstrained spontaneous speech. The utterances were spoken over commercial telephone lines by speakers of English, Farsi (Persian), French, German, Japanese, Korean, Mandarin Chinese, Spanish, Tamil, and Vietnamese. We have completed the initial phase of our data acquisition effort: the recording and initial verification of utterances produced by 100 different speakers in each of the 10 languages. We describe the recording protocol, data collection procedure, ongoing corpus development, preliminary results of the statistical evaluation of the 10 languages, and plans to provide orthographic transcriptions of the speech.
Similar resources
Discriminative Training of GMM for Language Identificatio..
In this paper, a discriminative training procedure for a Gaussian Mixture Model (GMM) language identification system is described. The proposal is based on the Generalized Probabilistic Descent (GPD) algorithm and Minimum Classification Error Rates formulated to estimate the GMM parameters. The evaluation is conducted using the OGI multi-language telephone speech corpus. The experimental result...
Language identification using acoustic log-likelihoods of syllable-like units
Automatic spoken language identification (LID) is the task of identifying the language from a short utterance of the speech signal uttered by an unknown speaker. The most successful approach to LID uses phone recognizers of several languages in parallel [Zissman, M.A., 1996. Comparison of four approaches to automatic language identification of telephone speech. IEEE Trans. Speech Audio Process....
Modeling prosody for language identification on read and spontaneous speech
This paper deals with an approach to Automatic Language Identification using only prosodic modeling. The actual approach for language identification focuses mainly on phonotactics because it gives the best results. We propose here to evaluate the relevance of prosodic information for language identification with read studio recording (previous experiment [1]) and spontaneous telephone speech. F...
Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds
The aim of this work is to exploit the acoustic-phonetic similarities between several languages. In recent work cross-language HMM-based phoneme models have been used only for bootstrapping the language-dependent models and the multi-lingual approach has been investigated only on very small speech corpora. In this paper, we introduce a statistical distance measure to determine the similarities...
Automatic Language Identification Using a Segment-Based Approach
A segment-based Automatic Language Identification (ALI) system has been developed. The system was designed around a formal probabilistic framework. This framework forms the basis for investigating the ALI approach proposed by House and Neuburg which utilizes phonotactic constraints of languages. The system incorporates different components which model the phonotactic, prosodic, and acoustic prope...